Predictive Anonymization: Utility-Preserving Publishing of Sparse Recommendation Data

نویسندگان

Chih-Cheng Chang

Brian Thompson

Danfeng Yao

چکیده

Recently, recommender systems have been introduced to predict user preferences for products or services. In order to seek better prediction techniques, data owners of recommender systems such as Netflix sometimes make their customers’ reviews available to the public, which raises serious privacy concerns. With only a small amount of knowledge about individuals and their ratings to some items in a recommender system, an adversary may easily identify the users and breach their privacy. Unfortunately, most of the existing privacy models (e.g., k-anonymity) cannot be directly applied to recommender systems. In this paper, we study the problem of privacy-preserving publishing of recommendation datasets. We represent recommendation data as a bipartite graph, and define several attacks on the graph that can re-identify users and determine their rated items and ratings. To deal with these attacks, we give formal privacy definitions in recommender systems. We develop a robust and efficient anonymization algorithm, Predictive Anonymization, to achieve the privacy goals. Our experimental results show that Predictive Anonymization can prevent the attacks with very little impact to prediction accuracy.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling

In recent years, privacy concerns about social network graph data publishing has increased due to the widespread use of such data for research purposes. This paper addresses the problem of identity disclosure risk of a node assuming that the adversary identifies one of its immediate neighbors in the published data. The related anonymity level of a graph is formulated and a mathematical model is...

متن کامل

Utility-preserving anonymization for health data publishing

BACKGROUND Publishing raw electronic health records (EHRs) may be considered as a breach of the privacy of individuals because they usually contain sensitive information. A common practice for the privacy-preserving data publishing is to anonymize the data before publishing, and thus satisfy privacy models such as k-anonymity. Among various anonymization techniques, generalization is the most c...

متن کامل

Preserving Privacy and Utility in RFID Data Publishing

Radio Frequency IDentification (RFID) is a technology that helps machines identify objects remotely. The RFID technology has been extensively used in many domains, such as mass transportation and healthcare management systems. The collected RFID data capture the detailed movement information of the tagged objects, offering tremendous opportunities for mining useful knowledge. Yet, publishing th...

متن کامل

ارایه یک روش جدید انتشار داده‌ها با حفظ محرمانگی با هدف بهبود دقّت طبقه‌‌بندی روی داده‌های گمنام

Data collection and storage has been facilitated by the growth in electronic services, and has led to recording vast amounts of personal information in public and private organizations databases. These records often include sensitive personal information (such as income and diseases) and must be covered from others access. But in some cases, mining the data and extraction of knowledge from thes...

متن کامل

A Knowledge Model Sharing Based Approach to Privacy-Preserving Data Mining

Privacy-preserving data mining (PPDM) is an important problem and is currently studied in three approaches: the cryptographic approach, the data publishing, and the model publishing. However, each of these approaches has some problems. The cryptographic approach does not protect privacy of learned knowledge models and may have performance and scalability issues. The data publishing, although is...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2009

Predictive Anonymization: Utility-Preserving Publishing of Sparse Recommendation Data

نویسندگان

چکیده

منابع مشابه

An Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling

Utility-preserving anonymization for health data publishing

Preserving Privacy and Utility in RFID Data Publishing

ارایه یک روش جدید انتشار داده‌ها با حفظ محرمانگی با هدف بهبود دقّت طبقه‌‌بندی روی داده‌های گمنام

A Knowledge Model Sharing Based Approach to Privacy-Preserving Data Mining

عنوان ژورنال:

اشتراک گذاری